A Model of the Statistical Power of Comparative Genome Sequence Analysis

Author

  • Sean R Eddy
Abstract

Comparative genome sequence analysis is powerful, but sequencing genomes is expensive. It is desirable to be able to predict how many genomes are needed for comparative genomics, and at what evolutionary distances. Here I describe a simple mathematical model for the common problem of identifying conserved sequences. The model leads to some useful rules of thumb. For a given evolutionary distance, the number of comparative genomes needed for a constant level of statistical stringency in identifying conserved regions scales inversely with the size of the conserved feature to be detected. At short evolutionary distances, the number of comparative genomes required also scales inversely with distance. These scaling behaviors provide some intuition for future comparative genome sequencing needs, such as the proposed use of "phylogenetic shadowing" methods using closely related comparative genomes, and the feasibility of high-resolution detection of small conserved features.
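The two rules of thumb in the abstract can be read as simple proportionalities. The sketch below is only an illustration of that arithmetic, not the paper's model: the function name `genomes_needed` and the reference values in the usage lines are hypothetical, and the inverse scaling with distance is assumed to hold only at short evolutionary distances, as the abstract states.

```python
def genomes_needed(n_ref, len_ref, dist_ref, length, distance):
    """Extrapolate a genome count from a known reference point.

    Applies the abstract's rules of thumb at fixed statistical stringency:
    the number of genomes scales inversely with feature size and, at short
    evolutionary distances only, inversely with distance.

    All names and the reference point are illustrative assumptions.
    """
    if min(n_ref, len_ref, dist_ref, length, distance) <= 0:
        raise ValueError("all arguments must be positive")
    return n_ref * (len_ref / length) * (dist_ref / distance)


# Hypothetical reference point: 4 genomes suffice for a 50-site feature
# at distance 0.1 (made-up numbers, for illustration only).
half_size = genomes_needed(4, 50, 0.1, length=25, distance=0.1)
half_distance = genomes_needed(4, 50, 0.1, length=50, distance=0.05)
print(half_size, half_distance)
```

Under these assumptions, halving either the feature size or the evolutionary distance doubles the number of comparative genomes required.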

Similar resources

Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species

Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome, which is an important wild bio-resource for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic inference in plants. Different ch...

Comparative Analysis of Financing Methods in Thermal Power Plants in Iran

Electricity has been considered as a pivotal product for sustainable development. As Iran has access to the high level of fossil fuel resources, the major proportion of Iran's power generation belongs to thermal power plants. In this regard, the issue that all governments have faced is financing the initial capital costs of these projects, which plays a crucial role in providing a conti...

Comparing Different Methodologies Used To Ensure the Security of RFID Credit Card: A Comparative Analysis

The use of Radio Frequency Identification (RFID) technology is growing rapidly across a wide range of industries. Developers apply the technology not only in traditional applications, such as asset or inventory tracking, but also in security services, electronic passports and RFID-embedded cards. However, RFID technology also brings different...

Genome-Scale Metabolic Network Models of Bacillus Species Suggest that Model Improvement is Necessary for Biotechnological Applications

Background: A genome-scale metabolic network model (GEM) is a mathematical representation of an organism’s metabolism. Today, GEMs are popular tools for computationally simulating biotechnological processes and for predicting biochemical properties of (engineered) strains. Objectives: In the present study, we have evaluated the predictive power of two ...

Journal title:
  • PLoS Biology

Volume 3

Publication date: 2005